A Language Independent Approach for Detecting Duplicated Code

Authors

  • Stéphane Ducasse
  • Matthias Rieger
  • Serge Demeyer
Abstract

Code duplication is one of the factors that severely complicate the maintenance and evolution of large software systems. Techniques for detecting duplicated code exist but rely mostly on parsers, a technology that has proven to be brittle in the face of different languages and dialects. In this paper we show that it is possible to circumvent this hindrance by applying a language-independent and visual approach, i.e. a tool that requires no parsing, yet is able to detect a significant amount of code duplication. We validate our approach on a number of case studies, involving four different implementation languages and ranging from 256 KB up to 13 MB of source code.
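
To make the idea of parser-free detection concrete, the following is a minimal sketch and not the authors' tool. It assumes that two lines count as duplicates when they are identical after all whitespace is removed, and that a clone is a run of consecutive matching lines (a diagonal in a line-by-line comparison matrix). The function names `normalize` and `find_duplicated_runs` and the `min_run` parameter are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def normalize(line):
    """Remove all whitespace so that layout changes do not hide a match."""
    return "".join(line.split())


def find_duplicated_runs(lines, min_run=2):
    """Return (start_a, start_b, length) triples with 1-based line numbers.

    Each triple describes a run of at least `min_run` consecutive lines
    starting at start_a that matches the run starting at start_b,
    comparing lines as plain strings after whitespace removal.
    No parsing of the programming language is involved.
    """
    keys = [normalize(line) for line in lines]

    # Index every non-blank normalized line by the positions where it occurs.
    index = defaultdict(list)
    for i, key in enumerate(keys):
        if key:
            index[key].append(i)

    # Every pair (i, j) with i < j and equal keys is one dot in the matrix.
    matches = set()
    for positions in index.values():
        for a, i in enumerate(positions):
            for j in positions[a + 1:]:
                matches.add((i, j))

    # Follow each diagonal from its first dot and keep the long ones.
    runs = []
    for i, j in sorted(matches):
        if (i - 1, j - 1) in matches:
            continue  # this dot continues a diagonal that started earlier
        length = 1
        while (i + length, j + length) in matches:
            length += 1
        if length >= min_run:
            runs.append((i + 1, j + 1, length))
    return runs


if __name__ == "__main__":
    sample = [
        "a = 1",
        "b = a + 2",
        "print(b)",
        "",
        "x = 0",
        "a  =  1",
        "b = a+2",
        "print( b )",
    ]
    print(find_duplicated_runs(sample))
```

Because the comparison works on raw text, the same sketch applies unchanged to any language. The trade-off under these assumptions is that renamed identifiers or reordered statements go undetected, and the pairwise matching grows quadratically with the number of occurrences of a frequently repeated line.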

Similar resources

Lightweight Detection of Duplicated Code — A Language-Independent Approach

Duplicated code can have a severe, negative impact on the maintainability of large software systems. Techniques for detecting duplicated code exist but they rely mostly on parsers, technology that is often fragile in the face of different languages and dialects. In this paper we show that a lightweight approach based on simple string-matching can be effectively used to detect a significant amou...

An Algorithm for Detecting and Removing Clones in Java Code

This paper proposes a new algorithm for automatically detecting and removing duplicated code in existing Java programs. Its purpose is to improve the structure of small code snippets (as in refactoring), rather than reducing the overall redundancy in huge legacy programs. As such, approaches that favor code clarity over efficiency are introduced. The skeleton of our algorithm is presented and i...

A Comparison of Similarity Techniques for Detecting Source Code Plagiarism

Academic dishonesty is a universal problem. Detecting duplicated text among natural language artifacts is a well-documented task. However, performing similar analysis on source code presents unique problems. In this paper, I present a comparison of the application of various techniques in textual similarity processing on source code. Beyond this, I investigate the application of textual similari...

Detecting Code Clones: A review

Code clone detection is involved with detecting duplicated fragments of code within a code base. Detecting these clones is useful for maintenance operations which require editing the clones. The tools developed are expected to be robust enough to identify clones even when they have been modified, whilst preserving reasonable recall and precision rates. It is also expected that these tools be ea...

Similar Code Detection and Elimination for Erlang Programs

A well-known bad code smell in refactoring and software maintenance is duplicated code, that is the existence of code clones, which are code fragments that are identical or similar to one another. Unjustified code clones increase code size, make maintenance and comprehension more difficult, and also indicate design problems such as a lack of encapsulation or abstraction. This paper describes an...

Publication date: 1999